{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pandas:日期的范围、频率以及移动"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "内容介绍:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.日期的范围及序列生成"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* 生成日期范围：根据指定的频率生成指定长度的DatatimeIndex\n",
    "\n",
    "函数语法:<br>\n",
    "date_range(start=None, end=None, periods=None, freq=None, tz=None, normalize=False, name=None, closed=None, **kwargs) -> pandas.core.indexes.datetimes.DatetimeIndex\n",
    "\n",
    "* start:开始日期\n",
    "* end:结束日期。如果没有结束日期，则生成从开始到现在的日期序列。\n",
    "* period:固定时期，取值为整数或None\n",
    "* freq:日期偏移量(频率)，取值为string或DateOffset,默认为'D'\n",
    "* name:生成索引对象的名称，取值为string或None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',\n",
       "               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08',\n",
       "               '2018-01-09', '2018-01-10', '2018-01-11', '2018-01-12',\n",
       "               '2018-01-13', '2018-01-14', '2018-01-15'],\n",
       "              dtype='datetime64[ns]', freq='D')"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#产生开始日期、结束日期的日期范围，频率默认为天\n",
    "times11 = pd.date_range('2018-1-1','2018-1-15')\n",
    "times11"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',\n",
       "               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08',\n",
       "               '2018-01-09', '2018-01-10', '2018-01-11', '2018-01-12',\n",
       "               '2018-01-13', '2018-01-14', '2018-01-15', '2018-01-16',\n",
       "               '2018-01-17', '2018-01-18', '2018-01-19', '2018-01-20'],\n",
       "              dtype='datetime64[ns]', freq='D')"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#指定开始日期，和生成的时期。默认按照天生成数据，即freq='D'。默认的第一个时间参数是start,可以指定end和periods\n",
    "times12 = pd.date_range('2018-1-1',periods=20)\n",
    "times12"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2018-01-31', '2018-02-28', '2018-03-31', '2018-04-30',\n",
       "               '2018-05-31', '2018-06-30', '2018-07-31', '2018-08-31',\n",
       "               '2018-09-30', '2018-10-31'],\n",
       "              dtype='datetime64[ns]', freq='M')"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#指定开始日期，和生成的时期，修改为月数据。\n",
    "times13 = pd.date_range('2018-1-1',periods=10,freq='M')\n",
    "times13"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2018-01-31 10:20:01', '2018-02-28 10:20:01',\n",
       "               '2018-03-31 10:20:01', '2018-04-30 10:20:01',\n",
       "               '2018-05-31 10:20:01', '2018-06-30 10:20:01',\n",
       "               '2018-07-31 10:20:01', '2018-08-31 10:20:01',\n",
       "               '2018-09-30 10:20:01', '2018-10-31 10:20:01'],\n",
       "              dtype='datetime64[ns]', freq='M')"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#产生带时间信息的数据\n",
    "times14 = pd.date_range('2018-1-1 10:20:01',periods=10,freq='M')\n",
    "times14"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',\n",
       "               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08',\n",
       "               '2018-01-09', '2018-01-10'],\n",
       "              dtype='datetime64[ns]', freq='D')"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#使用规范化参数产生日期的范围\n",
    "#默认情况下保留开始日期时间参数的时间戳。当normalize=True时，产生日期不带时间。\n",
    "times15 = pd.date_range('2018-1-1 10:20:01',periods=10,normalize=True)\n",
    "times15"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.频率组合\n",
    "\n",
    "freq参数使用比较灵活，能够控制更大范围的频率\n",
    "\n",
    "**频率**\n",
    "* 产生日期范围时，各个日期之间的间隔\n",
    "* 由一个基础频率和一个乘数组成,基础频率h和乘数4,组成4h\n",
    "* 基础频率使用字符串别名表示，如'D':天，'M':月\n",
    "* 每个基础频率都有一个对应的日期偏移量对象(data oddset),如H对应hour\n",
    "\n",
    "**举例:**\n",
    "* 4h :每4小时\n",
    "* 2H30min:2小时30分钟\n",
    "* B表示工日的每天\n",
    "* M日历月末\n",
    "* BM月内最后工作日\n",
    "* 时间序列频率较为复杂，建议查看文档，应用时详细研究"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2018-01-01 00:00:00', '2018-01-01 04:00:00',\n",
       "               '2018-01-01 08:00:00', '2018-01-01 12:00:00',\n",
       "               '2018-01-01 16:00:00', '2018-01-01 20:00:00',\n",
       "               '2018-01-02 00:00:00', '2018-01-02 04:00:00',\n",
       "               '2018-01-02 08:00:00', '2018-01-02 12:00:00',\n",
       "               '2018-01-02 16:00:00', '2018-01-02 20:00:00',\n",
       "               '2018-01-03 00:00:00', '2018-01-03 04:00:00',\n",
       "               '2018-01-03 08:00:00', '2018-01-03 12:00:00',\n",
       "               '2018-01-03 16:00:00', '2018-01-03 20:00:00',\n",
       "               '2018-01-04 00:00:00', '2018-01-04 04:00:00',\n",
       "               '2018-01-04 08:00:00', '2018-01-04 12:00:00',\n",
       "               '2018-01-04 16:00:00', '2018-01-04 20:00:00',\n",
       "               '2018-01-05 00:00:00'],\n",
       "              dtype='datetime64[ns]', freq='4H')"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#每间隔4小时\n",
    "times21 = pd.date_range('2018-1-1','2018-1-5',freq='4h')\n",
    "times21"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2018-01-01 00:00:00', '2018-01-01 02:30:00',\n",
       "               '2018-01-01 05:00:00', '2018-01-01 07:30:00',\n",
       "               '2018-01-01 10:00:00', '2018-01-01 12:30:00',\n",
       "               '2018-01-01 15:00:00', '2018-01-01 17:30:00',\n",
       "               '2018-01-01 20:00:00', '2018-01-01 22:30:00',\n",
       "               '2018-01-02 01:00:00', '2018-01-02 03:30:00',\n",
       "               '2018-01-02 06:00:00', '2018-01-02 08:30:00',\n",
       "               '2018-01-02 11:00:00', '2018-01-02 13:30:00',\n",
       "               '2018-01-02 16:00:00', '2018-01-02 18:30:00',\n",
       "               '2018-01-02 21:00:00', '2018-01-02 23:30:00'],\n",
       "              dtype='datetime64[ns]', freq='150T')"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#每间隔2小时30分钟\n",
    "times22 = pd.date_range('2018-1-1','2018-1-3',freq='2h30min')\n",
    "times22"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2021-01-15', '2021-02-19', '2021-03-19', '2021-04-16',\n",
       "               '2021-05-21', '2021-06-18', '2021-07-16', '2021-08-20',\n",
       "               '2021-09-17'],\n",
       "              dtype='datetime64[ns]', freq='WOM-3FRI')"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#月中某星期：'WOM'，每个月的第三个星期五：'WOM-3FRI'\n",
    "#获取一段日期中每个月的第三个星期五\n",
    "fri3 = pd.date_range('2021-1-1','2021-10-10',freq='WOM-3FRI')\n",
    "fri3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.日期偏移量\n",
    "\n",
    "* 可以使用日期偏移量方法生成日期连续索引数据\n",
    "* 可以生成同频率组合相同的数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 具体见2中的相关示例。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on function date_range in module pandas.core.indexes.datetimes:\n",
      "\n",
      "date_range(start=None, end=None, periods=None, freq=None, tz=None, normalize=False, name=None, closed=None, **kwargs) -> pandas.core.indexes.datetimes.DatetimeIndex\n",
      "    Return a fixed frequency DatetimeIndex.\n",
      "    \n",
      "    Parameters\n",
      "    ----------\n",
      "    start : str or datetime-like, optional\n",
      "        Left bound for generating dates.\n",
      "    end : str or datetime-like, optional\n",
      "        Right bound for generating dates.\n",
      "    periods : int, optional\n",
      "        Number of periods to generate.\n",
      "    freq : str or DateOffset, default 'D'\n",
      "        Frequency strings can have multiples, e.g. '5H'. See\n",
      "        :ref:`here <timeseries.offset_aliases>` for a list of\n",
      "        frequency aliases.\n",
      "    tz : str or tzinfo, optional\n",
      "        Time zone name for returning localized DatetimeIndex, for example\n",
      "        'Asia/Hong_Kong'. By default, the resulting DatetimeIndex is\n",
      "        timezone-naive.\n",
      "    normalize : bool, default False\n",
      "        Normalize start/end dates to midnight before generating date range.\n",
      "    name : str, default None\n",
      "        Name of the resulting DatetimeIndex.\n",
      "    closed : {None, 'left', 'right'}, optional\n",
      "        Make the interval closed with respect to the given frequency to\n",
      "        the 'left', 'right', or both sides (None, the default).\n",
      "    **kwargs\n",
      "        For compatibility. Has no effect on the result.\n",
      "    \n",
      "    Returns\n",
      "    -------\n",
      "    rng : DatetimeIndex\n",
      "    \n",
      "    See Also\n",
      "    --------\n",
      "    DatetimeIndex : An immutable container for datetimes.\n",
      "    timedelta_range : Return a fixed frequency TimedeltaIndex.\n",
      "    period_range : Return a fixed frequency PeriodIndex.\n",
      "    interval_range : Return a fixed frequency IntervalIndex.\n",
      "    \n",
      "    Notes\n",
      "    -----\n",
      "    Of the four parameters ``start``, ``end``, ``periods``, and ``freq``,\n",
      "    exactly three must be specified. If ``freq`` is omitted, the resulting\n",
      "    ``DatetimeIndex`` will have ``periods`` linearly spaced elements between\n",
      "    ``start`` and ``end`` (closed on both sides).\n",
      "    \n",
      "    To learn more about the frequency strings, please see `this link\n",
      "    <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases>`__.\n",
      "    \n",
      "    Examples\n",
      "    --------\n",
      "    **Specifying the values**\n",
      "    \n",
      "    The next four examples generate the same `DatetimeIndex`, but vary\n",
      "    the combination of `start`, `end` and `periods`.\n",
      "    \n",
      "    Specify `start` and `end`, with the default daily frequency.\n",
      "    \n",
      "    >>> pd.date_range(start='1/1/2018', end='1/08/2018')\n",
      "    DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',\n",
      "                   '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08'],\n",
      "                  dtype='datetime64[ns]', freq='D')\n",
      "    \n",
      "    Specify `start` and `periods`, the number of periods (days).\n",
      "    \n",
      "    >>> pd.date_range(start='1/1/2018', periods=8)\n",
      "    DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',\n",
      "                   '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08'],\n",
      "                  dtype='datetime64[ns]', freq='D')\n",
      "    \n",
      "    Specify `end` and `periods`, the number of periods (days).\n",
      "    \n",
      "    >>> pd.date_range(end='1/1/2018', periods=8)\n",
      "    DatetimeIndex(['2017-12-25', '2017-12-26', '2017-12-27', '2017-12-28',\n",
      "                   '2017-12-29', '2017-12-30', '2017-12-31', '2018-01-01'],\n",
      "                  dtype='datetime64[ns]', freq='D')\n",
      "    \n",
      "    Specify `start`, `end`, and `periods`; the frequency is generated\n",
      "    automatically (linearly spaced).\n",
      "    \n",
      "    >>> pd.date_range(start='2018-04-24', end='2018-04-27', periods=3)\n",
      "    DatetimeIndex(['2018-04-24 00:00:00', '2018-04-25 12:00:00',\n",
      "                   '2018-04-27 00:00:00'],\n",
      "                  dtype='datetime64[ns]', freq=None)\n",
      "    \n",
      "    **Other Parameters**\n",
      "    \n",
      "    Changed the `freq` (frequency) to ``'M'`` (month end frequency).\n",
      "    \n",
      "    >>> pd.date_range(start='1/1/2018', periods=5, freq='M')\n",
      "    DatetimeIndex(['2018-01-31', '2018-02-28', '2018-03-31', '2018-04-30',\n",
      "                   '2018-05-31'],\n",
      "                  dtype='datetime64[ns]', freq='M')\n",
      "    \n",
      "    Multiples are allowed\n",
      "    \n",
      "    >>> pd.date_range(start='1/1/2018', periods=5, freq='3M')\n",
      "    DatetimeIndex(['2018-01-31', '2018-04-30', '2018-07-31', '2018-10-31',\n",
      "                   '2019-01-31'],\n",
      "                  dtype='datetime64[ns]', freq='3M')\n",
      "    \n",
      "    `freq` can also be specified as an Offset object.\n",
      "    \n",
      "    >>> pd.date_range(start='1/1/2018', periods=5, freq=pd.offsets.MonthEnd(3))\n",
      "    DatetimeIndex(['2018-01-31', '2018-04-30', '2018-07-31', '2018-10-31',\n",
      "                   '2019-01-31'],\n",
      "                  dtype='datetime64[ns]', freq='3M')\n",
      "    \n",
      "    Specify `tz` to set the timezone.\n",
      "    \n",
      "    >>> pd.date_range(start='1/1/2018', periods=5, tz='Asia/Tokyo')\n",
      "    DatetimeIndex(['2018-01-01 00:00:00+09:00', '2018-01-02 00:00:00+09:00',\n",
      "                   '2018-01-03 00:00:00+09:00', '2018-01-04 00:00:00+09:00',\n",
      "                   '2018-01-05 00:00:00+09:00'],\n",
      "                  dtype='datetime64[ns, Asia/Tokyo]', freq='D')\n",
      "    \n",
      "    `closed` controls whether to include `start` and `end` that are on the\n",
      "    boundary. The default includes boundary points on either end.\n",
      "    \n",
      "    >>> pd.date_range(start='2017-01-01', end='2017-01-04', closed=None)\n",
      "    DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04'],\n",
      "                  dtype='datetime64[ns]', freq='D')\n",
      "    \n",
      "    Use ``closed='left'`` to exclude `end` if it falls on the boundary.\n",
      "    \n",
      "    >>> pd.date_range(start='2017-01-01', end='2017-01-04', closed='left')\n",
      "    DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03'],\n",
      "                  dtype='datetime64[ns]', freq='D')\n",
      "    \n",
      "    Use ``closed='right'`` to exclude `start` if it falls on the boundary.\n",
      "    \n",
      "    >>> pd.date_range(start='2017-01-01', end='2017-01-04', closed='right')\n",
      "    DatetimeIndex(['2017-01-02', '2017-01-03', '2017-01-04'],\n",
      "                  dtype='datetime64[ns]', freq='D')\n",
      "\n"
     ]
    }
   ],
   "source": [
    "help(pd.date_range)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
